METHOD AND SYSTEM FOR OPTIMALLY SELECTING A WEB
FIREWALL IN A TCP/IP NETWORK

Technical field of the invention


The present invention relates to computer networks, and
more particularly to a method and system in a TCP/IP network
for optimally selecting a Web Firewall according to some
response time and availability criteria.

Background art

INTERNET 

The Internet is a global network of computers and
computer networks (the "Net"). The Internet connects
computers that use a variety of different operating systems or
languages, including UNIX, DOS, Windows, Macintosh, and
others. To facilitate and allow the communication among these
various systems and languages, the Internet uses a language
referred to as TCP/IP ("Transmission Control Protocol/Internet
Protocol"). The TCP/IP protocol supports three basic
applications on the Internet:

• transmitting and receiving electronic mail,

• logging into remote computers (the "Telnet"), and

• transferring files and programs from one computer to
another ("FTP" or "File Transfer Protocol").

WORLD WIDE WEB 

With the increasing size and complexity of the Internet,
tools have been developed to help find information on the
network, often called navigators or navigation systems.
Navigation systems that have been developed include standards
such as Archie, Gopher and WAIS. The World Wide Web ("WWW" or
"the Web") is a more recent, superior navigation system. The
Web is:

• an Internet-based navigation system, 

• an information distribution and management system for the
Internet, and

• a dynamic format for communicating on the Web. 

The Web seamlessly integrates, for the user, different formats
of information, including still images, text, audio and video.
A user on the Web using a graphical user interface ("GUI",
pronounced "gooey") may transparently communicate with
different host computers on the system, different system
applications (including FTP and Telnet), and different
information formats for files and documents including, for
example, text, sound and graphics.

HYPERMEDIA

The Web uses hypertext and hypermedia. Hypertext is a
subset of hypermedia and refers to computer-based "documents"
in which readers move from one place to another in a document,
or to another document, in a non-linear manner. To do this,
the Web uses a client-server architecture. The Web servers
enable the user to access hypertext and hypermedia information
through the Web and the user's computer. (The user's computer
is referred to as a client computer of the Web Server
computers.) The clients send requests to the Web Servers,
which react, search and respond. The Web allows client
application software to request and receive hypermedia
documents (including formatted text, audio, video and
graphics) with hypertext link capabilities to other hypermedia
documents, from a Web file server.

The Web, then, can be viewed as a collection of document files
residing on Web host computers that are interconnected by 
hyperlinks using networking protocols, forming a virtual "web" 
that spans the Internet. 

UNIFORM RESOURCE LOCATORS 

A resource of the Internet is unambiguously identified by
a Uniform Resource Locator (URL), which is a pointer to a
particular resource at a particular location. A URL specifies
the protocol used to access a server (e.g. HTTP, FTP, ...), the
name of the server, and the location of a file on that server.
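
As a brief illustration of the three parts of a URL named above, the
following Python sketch splits a URL with the standard library. The
function name url_parts and the file path in the example URL are
hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return (protocol, server name, file location) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


# Hypothetical URL, used only as an example.
protocol, server, location = url_parts(
    "http://www.entreprise.com/docs/index.html")
```

Here protocol holds the access protocol, server the host name that is
later resolved through DNS, and location the path of the file on that
server.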

HYPER TEXT TRANSFER PROTOCOL 

Each Web page that appears on client monitors of the Web 
may appear as a complex document that integrates, for example, 
text, images, sounds and animation. Each such page may also 
contain hyperlinks to other Web documents so that a user at a 
client computer using a mouse may click on icons and may 
activate hyperlink jumps to a new page (which is a graphical 
representation of another document file) on the same or a 
different Web server. 
A Web server is a software program on a Web host computer
that answers requests from Web clients, typically over the
Internet. All Web servers use a language or protocol to
communicate with Web clients which is called Hyper Text
Transfer Protocol ("HTTP"). All types of data can be exchanged
among Web servers and clients using this protocol, including
Hyper Text Markup Language ("HTML"), graphics, sound and video.
HTML describes the layout, contents and hyperlinks of the
documents and pages. Web clients when browsing:

• convert user specified commands into HTTP GET requests,

• connect to the appropriate Web server to get information,
and

• wait for a response. The response from the server can be
the requested document or an error message.

After the document or an error message is returned, the
connection between the Web client and the Web server is
closed.

The first version of HTTP is a stateless protocol. That is,
with HTTP, there is no continuous connection between each
client and each server. The Web client using HTTP receives a
response as HTML data or other data. This description applies
to version 1.0 of the HTTP protocol, while the newer version
1.1 relaxes this limitation by keeping the connection between
the server and client alive under certain conditions.

BROWSER 

After receipt, the Web client formats and presents the
data or activates an ancillary application such as a sound
player to present the data. To do this, the server or the
client determines the various types of data received. The Web
Client is also referred to as the Web Browser, since it in fact
browses documents retrieved from the Web Server.

DOMAIN NAMES 

The host or computer names (like www.entreprise.com) are
translated into numeric Internet addresses (like 194.56.78.3),
and vice versa, by using a method called DNS ("Domain Name
Service"). DNS is supported by network-resident servers, also
known as domain name servers or DNS servers.

INTRANET

Some companies use the same mechanism as the Web to
communicate inside their own corporation. In this case, this
mechanism is called an "Intranet". These companies use the
same networking/transport protocols and locally based Web
servers to provide access to vast amounts of corporate
information in a cohesive fashion. As this data may be private
to the corporation, and because the members of the company
still need to have access to public Web information, these
companies protect the access to their network with special
equipment called a Firewall, so that people not belonging to
the company cannot reach the private Intranet from the public
Internet.

FIREWALL

A Firewall protects one or more computers with Internet
connections from access by external computers connected to the
Internet. A Firewall is a network configuration, usually
created by hardware and software, that forms a boundary
between networked computers within the Firewall and those
outside the Firewall. The computers within the Firewall form a
secure sub-network with internal access capabilities and
shared resources not available from the outside computers.

Often, a single machine, on which the Firewall runs, allows
access to both internal and external computers. Since the
computer on which the Firewall runs directly interacts with
the Internet, strict security measures against unwanted access
from external computers are required.

A Firewall is commonly used to protect information such
as electronic mail and data files within a physical building
or organization site. A Firewall reduces the risk of intrusion
by unauthorized people from the Internet. However, the same
security measures can limit access, or require special
software, for those inside the Firewall who wish to access
information on the outside. A Firewall can be configured using
"Proxies" or "Socks" to designate access to information from
each side of the Firewall.

PROXY SERVER 

An HTTP Proxy is a special server that typically runs in
conjunction with Firewall software and allows access to the
Internet from within a Firewall. The Proxy Server:

• waits for a request (for example an HTTP request) from
inside the Firewall,

• forwards the request to the remote server outside the
Firewall,

• reads the response, and

• sends the response back to the client.

A single computer can run multiple servers, each server
connection identified with a port number. A Proxy Server, like
an HTTP Server or an FTP Server, occupies a port. Typically, a
connection uses standardized port numbers for each protocol
(for example, HTTP = 80 and FTP = 21). That is why an end user
has to select a specific port number for each defined Proxy
Server. Web Browsers usually let the end user set the host
name and port number of the Proxy Servers in a customizable
panel. Protocols such as HTTP, FTP, Gopher, WAIS, and Security
can usually have designated Proxies. Proxies are generally
preferred over Socks for their ability to perform caching,
high-level logging, and access control, because they provide a
specific connection for each network service protocol.

SOCKS 

A Socks Server (also called Socks Gateway) is also
software that allows computers inside a Firewall to gain
access to the Internet. Socks is usually installed on a server
positioned either inside or on the Firewall. Computers within
the Firewall access the Socks Server as clients to reach the
Internet. Web Browsers usually let the end user set the host
name and port number of the Socks hosts (servers) in a
customizable panel. On some Operating Systems, the host is
specified in a separate file (e.g. socks.conf file). As the
Socks Server acts as a layer underneath the protocols (HTTP,
FTP, ...), it cannot cache data (as a Proxy does), because it
does not decode the protocol to know what kind of data it is
transferring.

OPTIONS 

The Web Browser often lets the end user select
between the different options "No Proxies", "Manual Proxy
Configuration", or "Automatic Proxy Configuration" to
designate the conduit between his computer and the Internet.

Users with a direct connection to the Internet should use
the default, which is "No Proxies".

If the Intranet is protected by one or several Firewalls, 
the end user may : 

• select one of these Firewalls as the elected Proxy, by 
entering its host name into the "Manual Proxy 
Configuration", or 

• automatically refer to the enterprise policy in terms
of Proxies attribution between locations, by pointing
to a common configuration file in a remote server. This
is done by choosing the "Automatic Proxy Configuration"
and by providing the Web Browser with the unique
address of the common configuration file ("Uniform
Resource Locator" or "URL") located in the remote
server.



Today, most of the Web Browsers are configured to forward
all requests, even requests for internal hosts, through the
Socks Firewall. So when the end user wants to have access to
an internal Web-based application, his request travels to the
Firewall, and is then reflected back into the internal
network. This sends internal traffic on a long path, puts
extra load on the Firewall and on the network, and worst of
all, slows down the response time the end user sees from the
applications and Web pages he is trying to access. This is
called "non flexible" Socks access (when everything goes via
the Socks Server).

MANUAL PROXY CONFIGURATION 

The Manual Proxy configuration in the Web Browser is
simple to process, but its main drawback is that the Firewall
(or Proxy) selection is then static. There is no dynamic
criterion for the Firewall selection, such as selection of the
Firewall providing the best response time. Firewall failures
require a manual reconfiguration of the navigation software to
point to another active Firewall, since the manual
configuration usually only allows the definition of one single
Firewall per protocol with no possibility to pre-configure a
backup Firewall. In addition to the manual proxy configuration
in the Web Browser, external procedures can be used to provide
some kind of robustness in the Firewall selection. They rely
for instance on the use of multiple Firewalls having the same
name defined as aliases in the Domain Name Server (DNS). But
this technique based on alias definition still has drawbacks
since, for instance, the DNS is not always contacted for name
resolution by Web Clients caching locally the name resolution.
Other techniques using external hardware equipment such as
load and request dispatchers provide more robustness and load
balancing, but still have drawbacks such as the need for
additional and costly hardware.

AUTOMATIC PROXY CONFIGURATION

Automatic Proxy Configuration (also referred to as
"autoproxy") can set the location of the HTTP, FTP, and Gopher
Proxy every time the Web Browser is started. An autoproxy
retrieves a file of address ranges and instructs the Web
Browser to either directly access internal IBM hosts or to go
to the Socks Server to access hosts on the Internet.

Automatic Proxy Configuration is more desirable than
simple Proxy Server Configuration in the Web Browser, because
much more sophisticated rules can be implemented about the way
Web pages are retrieved (directly or indirectly). Automatic
Proxy Configuration is useful to users, because the Web
Browser knows how to retrieve pages directly if the Proxy
Server fails. Also, Proxy requests can be directed to another
or multiple Proxy Servers at the discretion of the system
administrator, without the end user having to make any
additional changes to his Web Browser configuration. These
Proxy configuration files (also called autoproxy code) are
usually written in the Javascript language. The autoproxy
facility can also contain a file of address ranges for
instructing the Web Browser to either directly access internal
hosts or to go to the Socks Server to access hosts on the
Internet. The Socks Server protects the internal network from
unwanted public access while permitting access of network
members to the Internet. One of the drawbacks of this
"autoproxy" mechanism is that there is neither proactive
Firewall failure detection nor response time consideration.
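
The address-range rule described above (direct access for internal
hosts, Socks Server for everything else) can be sketched as follows.
This is a minimal illustration in Python, not the autoproxy code
itself: the address ranges, the Socks Server name and the function
name choose_route are hypothetical, and the "DIRECT" and "SOCKS
host:port" strings are borrowed from the usual proxy auto-configuration
convention as an assumption.

```python
import ipaddress

# Hypothetical internal address ranges; a real autoproxy file would
# list the ranges that actually belong to the Intranet.
INTERNAL_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def choose_route(host_ip, socks_server="socks.example.com:1080"):
    """Return "DIRECT" for internal hosts, else the Socks Server."""
    address = ipaddress.ip_address(host_ip)
    if any(address in network for network in INTERNAL_RANGES):
        return "DIRECT"
    return "SOCKS " + socks_server
```

Such a static rule illustrates the drawback noted above: the decision
depends only on the destination address, not on whether the Socks
Server is currently available or responsive.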

More explanations about the domain presented in the above 
sections can be found in the following publications 
incorporated herewith by reference: 

• "Java Network Programming" by Elliotte Rusty Harold,
published by O'Reilly, February 1997.

• "Internet in a nutshell" by Valerie Quercia, published by
O'Reilly, October 1997.

• "Building Internet Firewalls" by Brent Chapman and
Elizabeth Zwicky, published by O'Reilly, September 1995.

PROBLEM 

The problem to solve is to provide an optimized Web
access, with a dynamic Proxy or Socks Server selection to get
the best response time, and a detection of failures in Proxy
or Socks Servers to prevent Web service disruption. The current
solutions address this problem partially:

• Web Browsers can be manually configured with the target
Proxy or Socks Server. The main drawbacks of this solution
are the following:
  • There is no dynamic Proxy/Socks Server selection. A
    manual reconfiguration of the Web Browser upon
    Proxy/Socks Server failure is required.
  • Only a "manual" load balancing through the Web Browser
    static configuration is provided.
  • Proxy/Socks Server names must be known and manually
    configured by end users.

• Web Browsers can be configured with their autoproxy feature,
using a static list of target Proxy/Socks Servers downloaded
from a dedicated autoproxy URL (Uniform Resource Locator)
system. The main drawback of this solution is the following:
  • There is no response time consideration in the
    Proxy/Socks Server selection, nor efficient Proxy/Socks
    Server failure detection (i.e. the Web Browser waits for
    a time-out before switching to a backup, even at initial
    autoproxy loading).

An alternative to these current solutions is to cluster the
Proxy/Socks Servers using an external dispatcher system acting
as a single logical access point. All Web Browsers are then
manually configured with the name of that external dispatcher
system (as the target Proxy/Socks Server) which then routes
the traffic to a selected Proxy/Socks Server. An example of
such a dispatcher is the IBM Interactive Network
Dispatcher product. More information concerning this product
can be found in IBM's publication entitled "Interactive
Network Dispatcher V1.2 - User's Guide" GC31-8496-01
incorporated herewith by reference. Although a
dispatcher-oriented solution allows an efficient load balancing
in most cases, its main drawback is that an additional
dedicated system or specific hardware is required, and that the
external dispatcher name has to be manually configured by the
end users in their Web Browsers.

Objects of the Invention 

• The object of the present invention is to optimize
Proxy/Socks Server selection by using availability and
response time criteria.

• It is a further object of the present invention to optimize
the Web service performance by integrating a response time
factor into the Proxy/Socks Server selection.

• It is another object of the present invention to minimize
Web service interruption and thus to ensure a better service
availability by automatically detecting Proxy/Socks Server
failures.



Summary of the Invention 



The present invention relates to dynamic autoproxy
configuration and more particularly to a method and system for
optimizing selection of a Proxy/Socks Server according to some
response time and availability criteria. The invention rests
on a dynamic autoproxy mechanism using availability and
response time probes.



The present invention also relies on probes retrieving
well known HTML pages through each Proxy/Socks Server,
measuring associated response time, detecting Proxy/Socks
failures and degradation of response time.

The present invention also uses a CGI (Common Gateway
Interface) program for dynamically creating autoproxy code (in
a preferred embodiment Javascript code) on an autoproxy URL
(Uniform Resource Locator) system for selecting the
Proxy/Socks Server using availability and response time
information provided by probes.

The present invention fixes the drawbacks of the existing
solutions by integrating dynamic Proxy/Socks
availability and response time selection criteria into the
autoproxy mechanism.

The present invention provides the following advantages:

• Early detection of Proxy/Socks Server failures provides a
high Web service availability.

• Integration of a response time factor into the Proxy/Socks
Server selection optimizes the Web service performance.

• Induced HTTP survey traffic is minimized by running
availability and response time probes from a single
autoproxy URL system (compared with running the probes on
each Web Browser system).

• Integration of response time degradation in the probes
achieves a proactive Proxy/Socks Server failure detection.

• Periodical dynamic update of the "best" Proxy/Socks Server
can be provided to the Web Browser.

• Useless traffic to a failing Proxy/Socks Server is minimized
since Proxy/Socks Servers are excluded from the list of
available target servers upon failure detection.

• No additional or specific hardware is required.

• Ease of Web Browser configuration is provided to mobile
users (the Web Browser is configured once).

• Web Browser performance is not degraded because
availability and response time probes are not processed
within the downloaded autoproxy code (Javascript code) but
in the autoproxy URL system.



Drawings 

The novel and inventive features believed characteristic
of the invention are set forth in the appended claims. The
invention itself, however, as well as a preferred mode of use,
further objects and advantages thereof, will best be
understood by reference to the following detailed description
of an illustrative detailed embodiment when read in
conjunction with the accompanying drawings, wherein:

• Figure 1 is a general logical view of an end user system
interfacing a Web Browser for accessing the World Wide Web
according to prior art.

• Figure 2 is a general physical view of the set-up shown in
Figure 1, according to prior art.

• Figure 3 is a logical view of the availability and response
time probes external flows according to the present
invention.

• Figure 4 is a flow chart showing the internal logic flow of
the availability and response time probe introduced in
Figure 3 according to the present invention.

• Figure 5 is a physical view of the logical environment
described in Figure 3 according to the present invention.

• Figure 6 is a view of the data flows associated with the
entities depicted in Figure 5, according to the present
invention.

• Figure 7 depicts the storage of the availability and
response time probe measurements, according to the present
invention.

• Figure 8 is a flow chart of the program running on the
autoproxy URL (Uniform Resource Locator) system, according
to the present invention.



Preferred embodiment of the invention 

The present invention relates to dynamic autoproxy
configuration and more particularly to a method and system for
selecting a Proxy/Socks Server according to some response time
and availability criteria. It rests on a dynamic autoproxy
mechanism using availability and response time probes. It
relies on probes retrieving well known HTML pages through each
Proxy/Socks Server, measuring associated response time,
detecting Proxy/Socks failures and degradation of response
time.

It also uses a CGI (Common Gateway Interface) program for
dynamically creating autoproxy code (in a preferred embodiment
Javascript code) on an autoproxy URL (Uniform Resource
Locator) system for selecting said Proxy/Socks Server.

LOGICAL VIEW OF AN END USER ACCESSING THE WORLD WIDE WEB

Figure 1 shows a user system with a user interface (102)
running a program known as a Web Browser (101) which enables
access to the World-Wide-Web (WWW). The WWW content is
transferred using the HTTP protocol. HTTP requests and
responses travel between the Web Browser program (101)
and a destination Web Server (103) containing the WWW content
the user tries to access. The Firewall (104) between the Web
Browser (101) and the Web Server (103) acts as an intermediary
HTTP Proxy forwarding the HTTP requests and responses to their
destination. The Web Browser program (101) makes an HTTP
request to the Firewall (104) and the Firewall forwards the
request to the destination Web Server (103). The flow in the
reverse direction is the HTTP response which again goes via
the Firewall (104) on its way to the Web Browser (101). In
this way the Firewall can limit the traffic to the
transactions it is configured to allow (based on some defined
security and access control policy). The Firewall hence
protects the network where the Web Browser is located.

GENERAL PHYSICAL VIEW OF AN END USER ACCESSING THE WWW 

Figure 2 is a physical view of the set-up shown logically
in Figure 1. In this particular example, the Web Browser
(201) runs on a system attached to an Intranet (202). The
Firewalls (203) that protect the Intranet attach both the
(private) Intranet (202) and the (public) Internet (204). The
destination Web Server (205) also connects to the Internet.
This is the environment where the Web Browser, Firewalls, and
Web Server perform their function when the user is "browsing"
the Internet WWW. It is important to note that the
Firewalls attach two networks and hence are able to act as the
intermediary for communications between the two networks.
Multiple Firewalls are often used in order to provide some
degree of access robustness and load sharing.

LOGICAL VIEW OF AVAILABILITY AND RESPONSE TIME PROBES 

The domain of the invention is the one described in
Figure 1 and Figure 2, where a user within an Intranet wants
to access the World-Wide-Web using a Web Browser, and where
the Intranet network is protected from the Internet by several
Firewalls playing the role of so-called HTTP Proxies (Figure 2).
The issue is to select the "best" Proxy/Socks Server to ensure
an optimized availability and response time of the service to
the end user. To automatically optimize this selection, a
software component called a "WWW availability and response
time probe" is introduced. Its role is to provide selection
criteria. As shown in Figure 3, this data is gathered by
measuring the response time for requesting a specific content
of a well known Web Server. The induced HTTP survey traffic is
minimized by running the availability and response time
probes from a single autoproxy URL system (versus running the
probes on each Web Browser client system).

Figure 3 demonstrates the function of a flexible WWW
availability and response time probe and the way it can be
used to gather measurements on the availability and response
time of both HTTP Proxies and Socks Servers. The upper part of
Figure 3 details the interaction of the probe with an HTTP
Proxy Server (304). The client system (302) that runs the
probe (configured to test proxies) basically requests Web
content (page) from the Web Server (307) via the Proxy Server
(304), similar to the process shown in Figure 1. The HTTP
request in this case represents an "HTTP survey flow" (303) to
the Proxy Server. The Proxy Server forwards (306) the request
to the Web Server (via the Firewall (305), which is not
depicted). The client system times how long the
request/response HTTP survey flow takes and uses this
information as a measurement of the response time and
availability via the tested Proxy Server (for the Web content
samples that were tested). If the client system is also the
autoproxy URL system (301), then this measurement information
for each Proxy Server can be used to determine the
"best" Proxy Server to use. This can then be encoded in the
autoproxy URL that the Web Browser programs use to determine
the correct Proxy Server to use.

FR 9 99 001

The lower portion of Figure 3 shows a similar
arrangement, but in this case the measurement data is being
gathered for a Socks Server (Gateway) access method. Again a
client probe (309) makes an HTTP request that represents an
"HTTP survey flow" (310) which travels via the Socks Server
(311) and then on to (312) the destination Web site (313). This
HTTP request is for a set target URL (308) that is known to
exist on the target Web Server. Again it is the timing of how
long this survey takes that provides the measurement data
that can be used to generate an autoproxy URL that takes into
account the relative performance of a set of Socks Servers (or,
in the case above, HTTP Proxies).

Obviously, if there is no response to the HTTP survey
flow, then the particular Proxy or Socks Server being tested
can be marked as unavailable. In this way the autoproxy URL
can be used to avoid selecting Proxy or Socks Servers that do
not work.

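
The way such measurements could be turned into a choice of the "best"
server can be sketched as follows. This is only an illustration in
Python under stated assumptions: the function name select_best, the
dictionary layout and the use of None to mark an unavailable server
are hypothetical, and picking the smallest response time is one
possible reading of "best".

```python
def select_best(measurements):
    """Pick the available server with the lowest response time.

    measurements maps a server name to its measured response time in
    seconds, or to None when the server did not answer the survey flow.
    Returns None when no server is available.
    """
    available = {name: seconds
                 for name, seconds in measurements.items()
                 if seconds is not None}
    if not available:
        return None
    return min(available, key=available.get)
```

With the hypothetical input {"proxy1": 0.8, "proxy2": None,
"proxy3": 0.3}, the unavailable "proxy2" is skipped and "proxy3" is
selected.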

INTERNAL LOGIC OF THE AVAILABILITY AND RESPONSE TIME PROBES 

The internal mechanism of the probe itself is described
in Figure 4. The probe simulates a Web Client by requesting,
through an HTTP connection, a Web page from a target URL
through the target Proxy/Socks Server (using its host name and
port as a reference). The Web page is retrieved either through
a normal HTTP connection, or through a socksified flow (a flow
through a Socks Server). Typically, normal flow is used to
retrieve a Web page from a Proxy Server or from a Web Server,
while socksified flow is used to retrieve a Web page through a
Socks Server. Then, the probe basically checks that the Web
page:

• is received within an allowed amount of time in seconds, and

• contains a specific keyword to make sure that the received
page is correct.

When these two conditions are fulfilled, the Web page 
retrieval is successful. 

Finally, the probe returns either the associated response
time in seconds (successful retrieval) or a failure return
code. This mechanism retrieves one or multiple target Web
pages. When multiple Web pages are retrieved, the probe
program sequentially tests each Web page until one Web page
retrieval is successful or all Web page retrievals fail.
Probes:

• retrieve well known HTML pages through each Proxy/Socks
Server,

• measure associated response time, and also

• detect Proxy/Socks Server failures and response time
degradation.
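
The response time degradation mentioned in the last item could, for
instance, be detected by comparing each new measurement with a short
history of previous ones. The following Python sketch is an
assumption, not the method of the invention: the class name, the
history size of five samples and the factor of two are hypothetical
values chosen for illustration.

```python
from collections import deque


class ResponseHistory:
    """Short history of response times for one Proxy/Socks Server."""

    def __init__(self, size=5, factor=2.0):
        self.samples = deque(maxlen=size)  # oldest samples drop out
        self.factor = factor               # hypothetical threshold

    def add(self, seconds):
        """Record a measurement; return True if it shows degradation."""
        previous = list(self.samples)
        self.samples.append(seconds)
        if not previous:
            return False  # nothing to compare with yet
        baseline = sum(previous) / len(previous)
        return seconds > self.factor * baseline
```

A server whose measurements suddenly exceed twice its recent average
would be reported as degraded before it fails completely, which is the
kind of anticipation the probes are meant to provide.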

Figure 4 is a flow chart showing the internal logic flow 
of the WWW availability and response time probe introduced in 
Figure 3. 

30 • The first thing the probe program does is to start a timer 
(401) . 

FR 9 99 001 



SDOCID: <E1 9948001 103> 



• Next the probe program attempts to establish a connection 
(402) with the target Web Server to retrieve a Web page at 
the target URL (Universal Resource locator) . The probe, 
program establishes the connection according to the way it 

5 has been configured e.g. via an HTTP Proxy Server, via a 

Socks Server (Gateway) or directly. 

• If the attempt for establishing the connection is 
unsuccessful, the probe program immediately goes into error 
mode (408) . An error value is returned (407) by the probe 

10 program indicating that the connection is not possible. 

• If the attempt for establishing the connection is 
successful, then the Web page (403) is retrieved by the 
probe program. 

• The probe program then closes the connection (404) pursuant 
15 to the normal HTTP protocol procedure. 

• To ensure that the Web page has been correctly retrieved, 
the probe program then searches for known keywords (405) 
that are expected to be in the Web page. 

• If the keyword is found (406) in the Web page, then the Web 
20 page retrieval is successful. The timer is stopped and the 

correct response time for the operation is returned. By 
storing and integrating a short historic of the measured 
response time over time, the probe program can detect and 
return any response time degradation, thus enabling an 
25 anticipation of the Proxy/Socks Servers failures. 

• If however the correct keyword is not found (407) in the Web
page, then the Web page retrieval is unsuccessful and again
an error value is returned. This sort of error typically
occurs when the connection is successfully established but a
Web page containing an error is retrieved.

• The probe goes into retry mode (409) only when it is
configured to try multiple destination URLs as opposed to a
single URL. This adds some robustness to the testing
performed by the probe and insulates it somewhat from
one-off network "glitches" (e.g. dropped connections).
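
The probe logic of Figure 4 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the invention: the names `fetch_via_proxy` and `probe`, the 10 second timeout, the use of `None` as the error value, and the choice to restart the timer for each URL attempt are all hypothetical. Only the direct and HTTP Proxy Server cases are covered; a Socks Server would need a separate client library.

```python
import time
import urllib.request
from typing import Callable, Optional, Sequence


def fetch_via_proxy(url: str, proxy: Optional[str] = None,
                    timeout: float = 10.0) -> str:
    """Retrieve a Web page directly or via an HTTP Proxy Server (402-404).

    `proxy` is a hypothetical "http://host:port" string; None means direct.
    """
    # An empty mapping tells urllib to ignore any proxy set in the environment.
    mapping = {"http": proxy} if proxy is not None else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(mapping))
    # Leaving the "with" block closes the connection after the read (404).
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def probe(urls: Sequence[str], keyword: str,
          fetch: Callable[[str], str]) -> Optional[float]:
    """Return the response time in seconds, or None as the error value (407)."""
    for url in urls:                      # retry mode (409): try each URL
        start = time.monotonic()          # start the timer (401)
        try:
            page = fetch(url)             # connect, retrieve, close (402-404)
        except OSError:                   # error mode (408): no connection
            continue
        if keyword in page:               # keyword search (405, 406)
            return time.monotonic() - start
    return None                           # every URL failed the check
```

A caller would bind the Proxy/Socks Server under test into the `fetch` argument, for example `probe(urls, "Welcome", lambda u: fetch_via_proxy(u, "http://proxy1.example:8080"))`, where the host name and keyword are placeholders.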



PHYSICAL VIEW OF AVAILABILITY AND RESPONSE TIME PROBES 
EXTERNAL FLOWS 

The probes are used by various components and in various
flows (Figures 5 and 6) in order to provide the Web Browser
with the best Proxy/Socks Server. The data gathered by the
probes are indirectly downloaded to the Web Browser by using
an autoproxy mechanism. The present invention allows a
software implementation with no additional or specific
hardware.

The output from the probes is stored on the autoproxy URL
system as shown in Figure 7 and used to create the autoproxy
code (Javascript code in a preferred embodiment). No extra
processing takes place inside that code: Web Browser
performance is not degraded because the availability and
response time probes run on the autoproxy URL system, not
within the downloaded autoproxy code (Javascript code).

A CGI (Common Gateway Interface) program dynamically
creates the autoproxy code as shown in Figure 8 with the
availability and response time information provided by the
probes. The use of response time and availability criteria
for selecting a Proxy/Socks Server is fully compatible, and
can be combined, with existing criteria such as the client's
origin IP subnet.
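
As an illustration of what dynamically created autoproxy code might look like, the sketch below builds a Javascript file in the widely used Netscape proxy auto-config format, where a `FindProxyForURL(url, host)` function returns a semicolon-separated list of proxies to try in order. The text above does not specify this format, so its use here is an assumption, as are the Python function names `build_autoproxy_code` and `cgi_response` and the example host names.

```python
from typing import Optional


def build_autoproxy_code(best: str, backup: Optional[str] = None,
                         kind: str = "PROXY") -> str:
    """Return Javascript autoproxy code naming the best and backup servers.

    `best` and `backup` are "host:port" strings; `kind` is "PROXY" for an
    HTTP Proxy Server or "SOCKS" for a Socks Server.
    """
    entries = [f"{kind} {best}"]
    if backup:
        entries.append(f"{kind} {backup}")   # tried if the best one fails
    chain = "; ".join(entries)
    return (
        "function FindProxyForURL(url, host) {\n"
        f'    return "{chain}";\n'
        "}\n"
    )


def cgi_response(best: str, backup: Optional[str] = None) -> str:
    """Wrap the autoproxy code in a CGI response with its usual MIME type."""
    header = "Content-Type: application/x-ns-proxy-autoconfig\r\n\r\n"
    return header + build_autoproxy_code(best, backup)
```

For example, `build_autoproxy_code("proxy1.example:8080", "proxy2.example:8080")` yields a function whose return string lists the first server followed by the second.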

The use of response time and availability criteria also
provides proactive detection of Proxy/Socks Server failures
by taking response time degradation into account. The Web
Browser can be periodically and dynamically updated with a new
selection of the "best" Proxy/Socks Server using:

• a "refresh" tag in the autoproxy code,

• external code (or a Java applet), or

• a new feature in the Web Browser for periodically and
automatically refreshing the autoproxy code.

Another positive consequence is the minimization of useless
traffic to failing Proxy/Socks Servers, since Proxy/Socks
Servers are excluded from the list of available target
servers upon failure detection. Since an autoproxy mechanism
is used, there is no need to manually update the proxy
configuration in the Web Browser in case of a Proxy/Socks
Server failure. Proxy/Socks Server names or locations do not
need to be known and configured by the end user, thus
providing, for instance, a seamless service for mobile users.
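
One way the response time degradation mentioned above could be detected is to keep a short history of measurements per Proxy/Socks Server and compare each new value with the average of the previous ones. The sketch below is only an illustration: the class name `ResponseTimeHistory`, the window of five samples and the factor of 2.0 are assumptions, and the text does not prescribe any particular threshold.

```python
from collections import deque
from statistics import mean
from typing import Deque, Optional


class ResponseTimeHistory:
    """Short history of response times for one Proxy/Socks Server."""

    def __init__(self, window: int = 5, factor: float = 2.0) -> None:
        self.samples: Deque[float] = deque(maxlen=window)
        self.factor = factor

    def record(self, response_time: Optional[float]) -> str:
        """Classify a new measurement as "failed", "degraded" or "ok".

        None stands for the error value returned by a probe.
        """
        if response_time is None:
            return "failed"               # candidate for exclusion
        baseline = mean(self.samples) if self.samples else None
        self.samples.append(response_time)
        if baseline is not None and response_time > self.factor * baseline:
            return "degraded"             # early warning of a likely failure
        return "ok"
```

A server reported as "failed" would be left out of the list of available target servers, while a "degraded" one could be ranked lower before it fails outright.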

Figure 5 is a physical view of the logical environment
described in Figure 3. The Web Browser (501) attached to the
Intranet (502) is configured to use an autoproxy URL to
determine which Proxy/Socks Server (Firewall) (503) to use for
having access to the Internet (504) and the destination Web
Server (505). The system where the autoproxy URL resides (506)
also runs the availability and response time probes (507)
configured to test the Proxy/Socks Servers. The autoproxy URL
uses the CGI (Common Gateway Interface) (508) to dynamically
generate the autoproxy code of the autoproxy URL. The
autoproxy code is based on the information gathered by the
availability and response time probes. In this way the Web
Browser is configured with:

• an available Proxy/Socks Server, and

• what is deemed the best Proxy/Socks Server.

DATA FLOWS OF AVAILABILITY AND RESPONSE TIME PROBES

Figure 6 is a view of the actual data flows associated
with the entities depicted in Figure 5. Again, the Web Browser
(601) attached to the Intranet (602) is configured to use an
autoproxy URL to determine which Proxy/Socks Server (Firewall)
(603) to use for having access to the Internet (604) and the
destination Web Server (605). The Web Browser has access (609)
to the autoproxy URL system (606) to first determine which
Proxy/Socks Server it should use. The Web Browser can be
periodically and dynamically updated with the "best"
Proxy/Socks Server using:

• the "refresh" tag in the autoproxy code,

• external code (or a Java applet), or

• a new Web Browser feature for periodically and automatically
refreshing the autoproxy code.

The system where the autoproxy URL resides, and which runs
the availability and response time probes (607), uses the CGI
(Common Gateway Interface) (608) to dynamically generate the
autoproxy code (610) of the autoproxy URL. The autoproxy code
is based on the information gathered by the availability and
response time probes that have tested the Proxy/Socks Servers
via the HTTP survey flows (611 and 612) described in Figure 3.
In this way the Web Browser ends up with what is deemed the
best Proxy/Socks Server.

INTERNAL STORAGE OF AVAILABILITY AND RESPONSE TIME PROBES 

Figure 7 depicts the internal storage, within the
autoproxy URL system, of the information retrieved by the
availability and response time probes (701). Each probe
updates (702) a table in the autoproxy URL system (703) with
the measurements of each Proxy/Socks Server it tests. In this
way the table contains the current state of all Proxy/Socks
Servers that are candidates to be selected and used by the Web
Browser. At configurable or periodic time intervals, the
probes test the Proxy/Socks Servers again (704) and the cycle
is repeated.
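
The table of Figure 7 and its periodic refresh can be pictured with the following sketch. It is a hypothetical in-memory layout, not the storage format of the invention: the `Measurement` fields, the function names and the use of a Python dictionary keyed by server name are assumptions, and `probe_server` stands for any callable that returns a response time in seconds or `None` on error.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass
class Measurement:
    available: bool                  # False when the probe returned an error
    response_time: Optional[float]   # seconds, None when unavailable
    measured_at: float               # timestamp of the measurement


def update_table(table: Dict[str, Measurement], servers: Iterable[str],
                 probe_server: Callable[[str], Optional[float]],
                 now: Callable[[], float] = time.time) -> None:
    """Run one probe cycle and overwrite each server's row (702)."""
    for server in servers:
        response_time = probe_server(server)
        table[server] = Measurement(response_time is not None,
                                    response_time, now())


def run_probe_cycles(table: Dict[str, Measurement], servers: Iterable[str],
                     probe_server: Callable[[str], Optional[float]],
                     interval_seconds: float, cycles: int) -> None:
    """Repeat the probe cycle at a configurable interval (704)."""
    server_list = list(servers)
    for _ in range(cycles):
        update_table(table, server_list, probe_server)
        time.sleep(interval_seconds)
```

Because each cycle overwrites the previous row, the table always reflects the latest known state of every candidate Proxy/Socks Server.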

PROGRAM RUNNING AT AUTOPROXY URL SYSTEM 

Figure 8 shows the internal logic of the program running
on the autoproxy URL system.

• The autoproxy URL system is initially contacted by a Web
Browser (801) wanting to "know" which is the best
Proxy/Socks Server (or Firewall) to use. This is for
instance achieved by selecting the Automatic Proxy
Configuration option in the Web Browser and by providing
information such as the URL of the autoproxy code.
• The autoproxy URL system activates (802) the CGI (Common
Gateway Interface) program (via the Web Server CGI
extensions). The CGI program has access to all standard CGI
variables including the IP (Internet Protocol) address of
the Web Browser.

• The CGI program selects (803) the best Proxy/Socks Server
for the client system (Web Browser) based on both the IP
address of the Web Browser (obtained as a CGI variable) and
the information generated by the availability and response
time probes for each Proxy/Socks Server and stored in the
table of the autoproxy URL system (807). The IP address is
used to add a geographical criterion to the Proxy/Socks
Server selection. For instance, if two Proxy/Socks Servers
provide the same response time (one in the US, the other one
in Europe), the closest Proxy/Socks Server is preferred (the
one in Europe if the Web Browser is in Europe).
• To improve the robustness of the Proxy/Socks Server
selection, the CGI program selects a best "backup"
Proxy/Socks Server for the Web Browser (804). This "backup"
Proxy/Socks Server is automatically used by the Web Browser
after it times out attempting to use what it thinks is the
"best" Proxy/Socks Server. Again this "backup" Proxy/Socks
Server is selected using both the IP address of the Web
Browser (obtained as a CGI variable) and the information
generated by the availability and response time probes for
each Proxy/Socks Server and stored in the table of the
autoproxy URL system (807).
• Once the CGI program has selected the best and backup
Proxy/Socks Servers, it creates the autoproxy code (805).
This code is generally written in Javascript.

• Once the autoproxy code has been created, the autoproxy URL
system downloads it to the Web Browser (806) via standard
HTTP protocol as any other output of a CGI program.
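
The selection steps (803) and (804) can be sketched as a ranking of the available Proxy/Socks Servers. The example below is an assumption-laden illustration: the `Candidate` fields, the `select_best_and_backup` name and the `tolerance` used to treat two response times as "the same" are hypothetical, and closeness is approximated by checking whether the Web Browser's IP address falls in a subnet associated with each server.

```python
import ipaddress
from typing import List, NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    address: str                     # "host:port" of the Proxy/Socks Server
    response_time: Optional[float]   # seconds, None if the probe failed
    local_network: str               # subnet regarded as close to the server


def select_best_and_backup(
        client_ip: str, candidates: List[Candidate],
        tolerance: float = 0.05) -> Tuple[Optional[str], Optional[str]]:
    """Return the (best, backup) server addresses for a Web Browser."""
    client = ipaddress.ip_address(client_ip)
    # Servers in failure are excluded from the selection.
    available = [c for c in candidates if c.response_time is not None]

    def rank(c: Candidate) -> Tuple[int, bool, float]:
        bucket = round(c.response_time / tolerance)   # similar times tie
        is_far = client not in ipaddress.ip_network(c.local_network)
        return (bucket, is_far, c.response_time)      # near wins a tie

    ordered = sorted(available, key=rank)
    best = ordered[0].address if ordered else None
    backup = ordered[1].address if len(ordered) > 1 else None
    return best, backup
```

In this sketch response time dominates and the IP address only breaks ties, which mirrors the US/Europe example above; other weightings of the two criteria are equally possible.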

While the invention has been particularly shown and described
with reference to a preferred embodiment, it will be
understood that various changes in form and detail may be made
therein without departing from the spirit and scope of the
invention.

Claims



1. A method for dynamically selecting a firewall server
(603) for a web client (601), in particular a web browser
(601), in a Transmission Control Protocol/Internet
Protocol (TCP/IP) network comprising a plurality of
firewall servers (503),
said method comprising the steps of:

• measuring performance and availability of each
firewall server (603) using measurement probes
(607);

• dynamically selecting a firewall server according to
the performance and availability measurements
(607).



2. The method according to the preceding claim wherein the
step of measuring the performance and availability of
each firewall server (603) using measurement probes (607)
comprises the further step of:

• measuring the response time needed for retrieving
from a web server (605) known information, in
particular one or a plurality of known web pages,
through each firewall server (603).



3. The method according to the preceding claim wherein the
step of measuring the response time comprises the further
steps of:

• establishing (402) a connection with the web server
(605) through each firewall server (603);

• retrieving (403) the one or a plurality of known web
pages from the web server (605);

• checking (405) that the retrieved one or plurality of
web pages contain one or a plurality of known keywords.

4. The method according to any one of the preceding claims
wherein the step of measuring the performance of each
firewall server (603) using measurement probes (607)
comprises the further steps of:

• comparing for each firewall server said measured
response time with previous measured response times;

• determining for each firewall server (603) the
degradation or the amelioration of the measured response
time.



5. The method according to any one of the preceding claims
wherein the step of measuring the availability of each
firewall server using measurement probes (607) comprises
the further steps of:

• detecting failures on each firewall server;

• excluding firewall servers in failure from the step
of selecting a firewall server.



6. The method according to any one of the preceding claims
wherein said firewall server (603) is a proxy server
(304) and/or a socks server (311).

7. The method according to any one of the preceding claims
comprising the further steps of:

• processing performance and availability measurements
(607) from a single universal resource locator
(URL) system (606);

• dynamically creating a configuration file based on
the performance and availability measurements,
preferably in Javascript language, on said universal
resource locator (URL) system (606) for selecting
said firewall server (603).



8. The method according to any one of the preceding claims
wherein the step of dynamically creating a configuration
file is processed by a common gateway interface (CGI)
(608) on said universal resource locator (URL) system
(606).



9. The method according to any one of the preceding claims
wherein the step of selecting a firewall server (603)
comprises the further step of:

• downloading the configuration file from the
universal resource locator (URL) system (606) to the
web client, in particular to the web browser (601).



10. The method according to any one of the preceding claims
wherein the steps of measuring performance and
availability and of dynamically selecting a firewall
server (603) are periodically processed in the universal
resource locator (URL) system (606) and the configuration
file created by the common gateway interface (CGI) (608)
is periodically downloaded to the web client (601).

11. The method according to any one of the preceding claims
comprising the further steps of:

• pre-selecting a backup firewall server (603) in a
background process;

• switching to said backup firewall server in case of
failure of the selected firewall server.

12. The method according to any one of the preceding claims
wherein the step of selecting a firewall server according to
performance and availability measurements comprises the
further step of:

• selecting the firewall server according to the
Internet Protocol (IP) address.

13. A system comprising means adapted for carrying out the
method according to any one of the preceding claims.

METHOD AND SYSTEM FOR OPTIMALLY SELECTING A WEB
FIREWALL IN A TCP/IP NETWORK



Abstract 

The present invention relates to dynamic autoproxy
configuration, and more particularly to a method and system
for selecting a Proxy/Socks Server according to response time
and availability criteria. It rests on a dynamic autoproxy
mechanism using availability and response time probes. The
probes retrieve well-known HTML pages through each
Proxy/Socks Server, measure the associated response time, and
detect Proxy/Socks Server failures and degradation of
response time.

It also uses a CGI (Common Gateway Interface) program for
dynamically creating autoproxy code (in a preferred embodiment
Javascript code) on an autoproxy URL (Universal Resource
Locator) system for selecting said Proxy/Socks Server.

Figure 6

General physical view of an end user accessing the World-Wide-Web

Logical view of availability and response time probes external flows

FIG. 4: Flow chart of internal logic of availability and response time probe
[Flow chart boxes recoverable from the drawing: Establish connection with target URL (402), with Yes/No branches; Retrieve web page (403); Close Connection (404); Return: KO, Not Available (407)]

Physical view of availability and response time probes external flows

Data flows of availability and response time probe

Internal storage of WWW availability and response time probes

Flow chart of the program running at autoproxy URL system